ABSTRACT

We present a framework to conservatively estimate the probability that any particular planet-like
transit signal observed by the Kepler mission is in fact a planet, prior to any ground-based
follow-up efforts. We use Monte Carlo methods based on stellar population synthesis and Galactic
structure models, and report a priori false positive probabilities for every Kepler Object of Interest
in tabular form, assuming a 20% intrinsic occurrence rate of close-in planets in the radius range
$0.5\,R_\oplus < R_p < 20\,R_\oplus$. Almost every candidate has FPP < 10%, and over half have FPP < 5%.

This probability varies most strongly with the magnitude and Galactic latitude of the Kepler target 
star, and more weakly with transit depth. We establish that a single deep high-resolution image 
will be an extremely effective follow-up tool for the shallowest (Earth-sized) transits, providing the 
quickest route towards probabilistically "validating" the smallest candidates by potentially decreasing 
the false positive probability of an Earth-sized transit around a faint star from >10% to <1%. On the
other hand, we show that the most useful follow-up observations for moderate-depth (super-Earth and 
Neptune-sized) candidates are shallower AO imaging and high S/N spectroscopy. Since Kepler has 
detected many more planetary signals than can be positively confirmed with ground-based follow- 
up efforts in the near term, these calculations will be crucial to using the ensemble of Kepler data 
to determine population characteristics of planetary systems. We also describe how our analysis 
complements the Kepler team's more detailed BLENDER false positive analysis for planet validation. 



1. INTRODUCTION

In the wake of the first full release of planet candidates
from the Kepler mission (Koch et al. 1998; Borucki et al.
2008, 2011), the study of the properties of exoplanetary
systems has entered a new era. For the first time there
exists a large uniform sample of transiting planets largely
unaffected by the detection challenges and selection
effects inherent in ground-based searches (e.g. Gaudi 2005;
Gaudi et al. 2005), enabling the first clear glimpse
of the population of exoplanets down to the size of Earth 
as well as the first opportunity to study planet radii at 
large orbital separations. However, follow-up observa- 
tions to unambiguously confirm individual signals are 
time-consuming and difficult (or impossible), especially 
for fainter stars and smaller planets. Consequently, in or- 
der to understand what the population of Kepler transit- 
like signals can tell us about the population of exoplanets 
in general, the problem of astrophysical false positives 
must be understood. 

From the early days of planet transit searches, eclipsing binary systems masquerading as
transit signals have plagued detection efforts (Konacki et al. 2003; O'Donovan et al. 2006;
Poleski et al. 2010; Almenara et al. 2009). Generally speaking, there are three types of
astrophysical false positive: a grazing eclipsing binary, a dwarf star eclipsing a giant star,
and a blended eclipsing binary system, which may be either a hierarchical triple system or an
unassociated binary blended within the aperture of a target star (Torres et al. 2004).

The remarkable photometric precision that Kepler is delivering (Jenkins et al. 2010b; ~30 ppm)
allows for an immediate simplification of the false positive landscape. Batalha et al. (2010a)
explain the multitude of ways that certain common false positive scenarios can be identified
from Kepler photometry alone. For example, grazing eclipsing binaries can be identified by
their V-shaped transits, and the giant-eclipsed-by-a-dwarf scenario can be avoided both by the
comprehensive work that went into assembling the Kepler Input Catalog (Latham et al. 2005;
Batalha et al. 2010b) and by the ability to photometrically identify giants by their elevated
levels of stellar variability compared with dwarf stars (Basri et al. 2010). Even many blended
binaries can be identified from the Kepler photometry and astrometry alone, by looking for a
shift in the center of light, e.g. the "rain diagrams" of Jenkins et al. (2010a). However, some
blended binary scenarios remain undetectable by this technique, especially those in
hierarchical triple systems, and so a detailed understanding of the false positive problem for
Kepler requires a detailed understanding of the probability of encountering such blend
scenarios.

The Kepler team has proven that extremely careful and detailed analyses of individual systems
can "validate" planets probabilistically by combining various follow-up observations with
modeling the light curves of all possible false positive scenarios with the so-called BLENDER
software (Torres et al. 2011). However, this method is computationally expensive and
labor-intensive, and only three BLENDER-validated planets have been revealed to date:
Kepler-9d (Torres et al. 2011), Kepler-11f (Lissauer et al. 2011), and Kepler-10c (Fressin et al.
2011). With dedicated supercomputer resources coming online for the Kepler team's use, this
number will certainly rise, but the fact remains that it will be a long time before the BLENDER
method can be applied to any large number of the Kepler candidates (Kepler team, 2011, private
comm.); in the meantime statistical interpretations of the candidate sample will rely on
statistical assumptions of the false positive rate.

^ ... configurations mimicking transiting planet signals. For discussion
of scenarios involving "blended planets," see Appendix.

There has been significant previous effort in the litera- 
ture dedicated to predicting the expected rate of false 
positive transit signals. Brown (2003) pioneered this 
work by predicting the rates of different types of false 
positives and Jovian planet detections for a variety of 
different surveys, including the then-future Kepler mission.
Evans & Sackett (2010) greatly extend this work by
deriving detection and false positive rates from full-scale
bottom-up simulations of synthetic ground-based transit
surveys, taking into account all false positive possibilities
and many details not included by Brown (2003). We continue
in the tradition of these authors with an analysis
directly applicable to the Kepler mission, approaching 
from a slightly different angle. Instead of focusing on 
predicting an overall number or expected rate of planet 
detections or false positives, we instead seek a simple an- 
swer to the following question: "What is a conservative 
estimate of the probability that an observed apparent 
transit signal is in fact a true transiting planet?" By 
framing the issue in this manner we are able to sidestep 
the complex issue of detectability, as our analysis as- 
sumes a transit-like signal has been detected. 

Our philosophy in this work is not to take into ac- 
count all conceivable details of transit signals, but rather 
to consider only those which are most salient: the bright- 
ness of the Kepler target star, its location in the field, 
and transit signal depth. The details we choose not to 
address in this work (notably transit period and dura- 
tion) are those we judge would add uncertainty to our 
calculations while tending to only decrease our estimates 
of the false positive probability. Thus we are able to 
keep our analysis straightforward, yet remain confident 
we are calculating conservative upper limits to the prob- 
ability that any given Kepler transit signal might be a 
false positive. As we show in the sections that follow, even
these conservative upper limits are enough to indicate 
that Kepler planet candidates will only rarely turn out 
to be false positives. 

2. BASIC BAYESIAN FRAMEWORK 

The probability that a given transit signal is of plane- 
tary origin may be expressed as the following, according 
to Bayes' theorem: 

$$\Pr(\mathrm{planet} \mid \mathrm{signal}) = \frac{\Pr(\mathrm{signal} \mid \mathrm{planet})\,\Pr(\mathrm{planet})}{\Pr(\mathrm{signal})}. \tag{1}$$

In this framework Pr(signal | planet) is the probability of
obtaining the observed signal given that there is a transiting
planet on an orbit of a particular period. This
factor is known as the likelihood of the signal under
the planet hypothesis, and we will abbreviate it as $\mathcal{L}_{\rm pl}$.
Pr(planet) is the probability of a star hosting a transiting
planet (the occurrence rate of planets times the transit
probability), which must enter the calculation as an a
priori assumption. Thus we call this factor, according to
Bayesian convention, the prior on planets, and designate
it $\pi_{\rm pl}$.

Since there are only two possible origins of a transit-like
signal (planet or false positive), the denominator of
Equation (1) can be rewritten by marginalizing over the
possible models:

$$\Pr(\mathrm{signal}) = \mathcal{L}_{\rm pl}\,\pi_{\rm pl} + \mathcal{L}_{\rm FP}\,\pi_{\rm FP}. \tag{2}$$

Using our convention, $\mathcal{L}_{\rm FP}$ and $\pi_{\rm FP}$ are the likelihood
and prior for a false positive signal. The false positive
term can be further broken down to account for the
two specific false positive scenarios we are exploring: the
blended eclipsing binary (BB) and the hierarchical eclipsing
triple (HT), allowing Equation (1) to be rewritten as
the following:

$$\Pr(\mathrm{planet} \mid \mathrm{signal}) = \frac{\mathcal{L}_{\rm pl}\,\pi_{\rm pl}}{\mathcal{L}_{\rm pl}\,\pi_{\rm pl} + \mathcal{L}_{\rm BB}\,\pi_{\rm BB} + \mathcal{L}_{\rm HT}\,\pi_{\rm HT}}. \tag{3}$$

In general, the likelihoods depend on the particularities
of the transit signal and enable discrimination between
models depending on the transit depth, shape, or period.
For now we ignore these details, assuming for the moment
that we have no knowledge of the differences between
the kind of transit signals to expect from planets and
from false positives. This enables us to write a simplified
version of Equation (3):

$$\Pr(\mathrm{planet} \mid \mathrm{signal}) \approx \frac{\pi_{\rm pl}}{\pi_{\rm pl} + \pi_{\rm BB} + \pi_{\rm HT}}. \tag{4}$$

We then define the "false positive probability" (FPP) as 
the complement of this probability: 

FPP = 1 - Pr(planet | signal) (5) 

Thus, before considering any detailed information of a 
particular light curve, the probability that an observed 
transit signal is actually a false positive depends only 
on the relative occurrence rates of planets and the false 
positive scenarios. As mentioned above, $\pi_{\rm pl}$ is simply
an assumed occurrence rate of planets times the transit
probability; we explain how we determine $\pi_{\rm BB}$ and $\pi_{\rm HT}$
in the following subsections. We first explain this priors-only
framework in order to elucidate what dominates our
final results, but in §3 we will include the likelihoods we
removed in Equation (4), taking into account dependence
on the depth of the transit signal.
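
As a minimal sketch of Equations (4) and (5), the priors-only false positive probability can
be computed as below. The function name and the numerical priors in the usage lines are
illustrative placeholders introduced here, not values taken from the analysis.

```python
def false_positive_probability(pi_pl, pi_bb, pi_ht):
    """Priors-only FPP (Equations 4 and 5): 1 - pi_pl / (pi_pl + pi_bb + pi_ht)."""
    total = pi_pl + pi_bb + pi_ht
    if total <= 0:
        raise ValueError("the priors must sum to a positive number")
    return 1.0 - pi_pl / total


# Hypothetical priors chosen only to show the call signature.
example_fpp = false_positive_probability(pi_pl=0.01, pi_bb=1e-4, pi_ht=1e-4)
print(f"FPP = {example_fpp:.4f}")
```

Because only ratios of the priors enter, the result is driven by how the planet prior compares
with the summed false positive priors.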

2.1. Blended Binaries 

The probability of a transit-mimicking binary system
being blended within the aperture of a Kepler target star
($\pi_{\rm BB}$) can be broken down in the following way:

$$\pi_{\rm BB} = \Pr(\mathrm{blend}) \cdot \Pr(\text{appropriate eclipsing binary}). \tag{6}$$

The first factor here is the probability for a potentially 
blending star to be projected within a given radius of a 
Kepler star, and the second is the probability for that 
star to be an eclipsing binary system that can appropri- 
ately mimic a planetary transit. 

To calculate these probabilities, we use the stellar population synthesis and Galactic
structure code TRILEGAL (TRIdimensional modeL of thE GALaxy; Girardi et al. 2005), which is
publicly available on the web (http://stev.oapd.inaf.it/cgi-bin/trilegal). TRILEGAL simulates
the physical and photometric properties of the stars along a given line of sight, using various
stellar evolution grids (Girardi et al. 2002; Chabrier et al. 2000) and a Galactic model that
includes a halo, thin and thick disks, and a bulge. All of our simulations use a Chabrier
lognormal IMF (Chabrier 2001) and default TRILEGAL values for the Galactic structure
parameters, including a squared hyperbolic secant structure for the thin disk, an exponential
structure for the thick disk, and an oblate spheroid for the halo.

Fig. 1. The probability for a possibly blending star to be projected within 2" of a Kepler
target star, as a function of Galactic latitude, as determined by TRILEGAL simulations. The
plotted points are simulations; the lines are the exponential fits as described in Equation (8).

TABLE 1
Polynomial coefficients^ for Equation (14)

        c0            c1            c2            c3            c4
A   -2.5038e-3     0.12912      -2.4273       19.980       -60.931
B    3.0668e-3    -0.15902       3.0365      -25.320        82.605
C   -1.5465e-5     7.5396e-4    -1.2836e-2     9.6434e-2    -0.27166
D    2.7978e-7    -1.5572e-5     3.1957e-4    -2.8543e-3     9.3191e-3
E   -6.4215e-6     3.5358e-4    -7.1463e-3     6.2522e-2    -0.19743

^ This table lists the polynomial coefficients for the empirical fits to how the blended
binary false positive probability as a function of Galactic latitude changes with Kepler
magnitude $m_K$. A, B, C, D, and E are functions of $m_K$, valid between $m_K = 11$ and
$m_K = 16$. The polynomials are of the form
$c_0 m_K^4 + c_1 m_K^3 + c_2 m_K^2 + c_3 m_K + c_4$.

2.1.1. Probability of a blend

The blend probability can be calculated by determining the average sky density (e.g. stars
per square arcsec) of stars faint enough so as not to be obviously present yet bright enough
to possibly mimic a transit. The first condition is somewhat subjective, and we conservatively
say that a star must be more than 1 magnitude fainter than the Kepler primary in order to be
able to hide undetected within the Kepler aperture. In practice the true value is probably
significantly fainter, but this approximation will lead to only a small overestimate of the
blended star probability, as there are many more faint than bright stars.

The faint condition can be determined by noting that in order for a blended eclipsing binary
system to mimic a transit of fractional depth $\delta$, the blended system must comprise more
than a fraction $\delta$ of the total flux within the Kepler aperture. This condition may be
expressed as the following:

$$m_{K,\mathrm{bin}} - m_{K,\mathrm{target}} = \Delta m_K = -2.5 \log_{10}(\delta), \tag{7}$$

where $m_{K,\mathrm{bin}}$ is the total apparent Kepler magnitude of the blended binary system
and $m_{K,\mathrm{target}}$ is the magnitude of the Kepler target star. A transit depth of
$\delta = 0.01$ corresponds to $\Delta m_K = 5$; for $\delta = 10^{-3}$, $\Delta m_K = 7.5$;
and for $\delta = 10^{-4}$ (approximately an Earth-sized transit of a Solar-radius star),
$\Delta m_K = 10$. This means that no binary system fainter than $m_K = 24$ can possibly mimic
a $\delta = 10^{-4}$ transit around a $m_K = 14$ star, which is a typical magnitude for a
Kepler target.
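
The magnitude condition of Equation (7) is straightforward to evaluate numerically. The sketch
below uses hypothetical helper names introduced only for illustration; it converts a transit
depth into the largest magnitude difference a blended binary may have, and into the faintest
blend magnitude for a given target.

```python
import math


def max_magnitude_difference(depth):
    """Equation (7): largest Delta m_K at which a blend can mimic a transit of this depth."""
    if not 0.0 < depth <= 1.0:
        raise ValueError("depth must be a fraction in (0, 1]")
    return -2.5 * math.log10(depth)


def faintest_blend_magnitude(target_mag, depth):
    """Faintest Kepler magnitude a blended binary can have and still mimic the transit."""
    return target_mag + max_magnitude_difference(depth)


for depth in (1e-2, 1e-3, 1e-4):
    print(depth, max_magnitude_difference(depth), faintest_blend_magnitude(14.0, depth))
```

Each factor of 10 in depth corresponds to 2.5 magnitudes, so shallower transits admit a much
deeper (and more densely populated) range of background stars.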

Using TRILEGAL, we determine the sky density of stars in this magnitude range within the
Kepler field, and thus the probability of one by chance being projected close to a Kepler
target star, by simulating a 10 deg$^2$ field centered on the center of the Kepler field. We
then simply count the stars within the desired range of Kepler magnitude (which TRILEGAL
provides). As a fiducial example, the average density of stars between $m_K = 15$ and
$m_K = 24$, the range corresponding to a $\delta = 10^{-4}$ transit of a $m_K = 14$ star, is
0.0085 stars arcsec$^{-2}$. The probability of any given small circle on the sky containing
one of these stars is then simply the area of the circle multiplied by this density.
Continuing this example ($m_K = 14$, $\delta = 10^{-4}$), the probability of such a star being
within 2" of a Kepler target star is 0.11.
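
The fiducial number above follows from multiplying the stellar density by the area of the
circle. A minimal sketch, with a hypothetical function name, is:

```python
import math


def expected_blends(density_per_arcsec2, radius_arcsec):
    """Mean number of blending stars inside a circle: density times pi * r**2.

    For values much smaller than 1 this can be read as a probability.
    """
    return density_per_arcsec2 * math.pi * radius_arcsec ** 2


# Fiducial values quoted in the text: 0.0085 stars per square arcsec, 2 arcsec radius.
p_blend = expected_blends(0.0085, 2.0)
print(f"{p_blend:.3f}")
```

When the product exceeds 1, as it can for large apertures, it must be interpreted as an
average number of blending stars rather than a probability.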

However, because the Kepler field is quite extended and centered only a few degrees off the
Galactic plane, there is a considerable gradient in background stellar density across the
field that must be accounted for. To accomplish this, we simulate 21 different 5 deg$^2$
fields, each centered on one of the Kepler double-CCD squares. The resulting probabilities are
plotted in Figure 1 as a function of Galactic latitude, for the magnitude ranges corresponding
to $m_K = 11$, 12, 13, 14, and 15. Recognizing that this blend probability appears to be
exponentially related to Galactic latitude b, and that the nature of the exponential depends
on $m_K$, we fit an analytic expression of the following form:

$$P_{\rm blend}(b, m_K) = C(m_K) + A(m_K)\,e^{-b/B(m_K)}, \tag{8}$$

where A, B, and C are all polynomial functions of Kepler magnitude, with the coefficients
listed in Table 1. These fits are valid between $m_K$ values of 11 and 15, and b values
between 7° and 20° (the approximate extent of the Kepler field). Figure 2 graphically
illustrates the behavior of Equation (8).

2.1.2. Probability of an appropriate eclipsing binary 

TABLE 2
Predicted False Positive Probabilities: Basic Framework

Experiment                        Threshold  Blend Radius  <# blends>  pi_BB       pi_HT       pi_pl     FPP
(1)                               (2)        (3)           (4)         (5)         (6)         (7)       (8)
Kepler                            10^-4      2"            0.11        1.1 x 10^-4 1.2 x 10^-4 0.01      0.02
Wide-Field Survey (e.g. HATNet)   0.005      14"           1.67        0.0014      2.7 x 10^-4 5 x 10^-4 0.77
CoRoT                             10^-3      10"           2.71        0.0035      1.4 x 10^-4 0.01      0.27

(1) Name of a transit survey.
(2) Fractional depth detection threshold.
(3) Effective aperture size inside which a blended star might reside. Kepler can restrict this
    radius to 2" by centroid analysis.
(4) The expected number of blending stars per aperture, based on estimates of the density of
    stars within the possibly-blending magnitude range for each experiment.
(5) The rate we calculate for the blended eclipsing binary false positive scenario.
(6) The rate we calculate for the hierarchical eclipsing triple false positive scenario.
(7) The assumed rate of detectable transiting planets.
(8) False positive probability = $1 - \pi_{\rm pl}/(\pi_{\rm BB} + \pi_{\rm HT} + \pi_{\rm pl})$.

Fig. 2. The probability for a possibly blending star to be projected within 2" of a Kepler
target star, as a function of both Galactic latitude and target star magnitude, as determined
by TRILEGAL simulations.

The probability that a blended star is an appropriately configured eclipsing binary system
depends first on the binary fraction of blending stars, and secondly on both the distribution
of binary properties and the magnitude of the Kepler target star. Of central importance is
that in order for a blended binary to successfully mimic a Kepler planet transit candidate, it
must both have a diluted primary eclipse shallow enough to look like a planet and a diluted
secondary eclipse shallow enough so as not to be detected.

The apparent fractional "transit" depth of a blended
binary system depends on the intrinsic binary system
eclipse depth $\delta_b$, and the relative apparent magnitudes
of the Kepler target star and the blended system:

$$\delta = \delta_b \, 10^{-0.4\,\Delta m_K}. \tag{9}$$

The primary and secondary eclipse depths of the binary
system are the following:

$$\delta_{b,\mathrm{pri}} = \left(\frac{R_2}{R_1}\right)^2 \frac{F_1}{F_1 + F_2} \tag{10}$$

and

$$\delta_{b,\mathrm{sec}} = \frac{F_2}{F_1 + F_2}, \tag{11}$$

where $R_1$ and $F_1$ are the stellar radius and flux in the
Kepler band of the larger of the two stars, and $R_2$ and
$F_2$ are those of the smaller star.

The conditions we define for a binary to be "appropriate" are for the diluted primary eclipse
depth to be between 0.02 and $10^{-4}$ (shallow enough to look like a planet, but still
detectable), and for the diluted secondary to be shallower than $10^{-4}$ (undetectable). We
recognize that "detectability" of a transit is a function of more than just the transit depth,
but for our purposes we use a depth of $10^{-4}$ as the detection threshold. A more detailed
population study based on Kepler candidates should rather use the signal-to-noise ratio of a
transit as the criterion for detectability (Beatty & Gaudi 2008). However, as our framework
deals with how to interpret signals once they are detected, careful detectability analysis is
unnecessary.
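
The "appropriate" test can be sketched in a few lines. The block below is an illustration
only: it assumes the standard geometric forms for the undiluted depths (a primary depth of
$(R_2/R_1)^2 F_1/(F_1+F_2)$ and a secondary depth of $F_2/(F_1+F_2)$) and a dilution factor of
$10^{-0.4\,\Delta m_K}$, consistent with Equation (7). All function names and the example
stellar parameters are hypothetical.

```python
def eclipse_depths(r1, f1, r2, f2):
    """Undiluted (primary, secondary) eclipse depths; star 1 is the larger star.

    Assumed forms: the smaller star blocks (r2/r1)**2 of the larger star's flux,
    and is itself fully hidden half an orbit later.
    """
    total = f1 + f2
    primary = (r2 / r1) ** 2 * f1 / total
    secondary = f2 / total
    return primary, secondary


def dilute(depth, delta_mag):
    """Depth after dilution by a target star brighter by delta_mag magnitudes (assumed form)."""
    return depth * 10.0 ** (-0.4 * delta_mag)


def is_appropriate(r1, f1, r2, f2, delta_mag, max_depth=0.02, threshold=1e-4):
    """True if the diluted primary looks planetary and the diluted secondary is undetectable."""
    primary, secondary = eclipse_depths(r1, f1, r2, f2)
    diluted_primary = dilute(primary, delta_mag)
    diluted_secondary = dilute(secondary, delta_mag)
    return threshold < diluted_primary < max_depth and diluted_secondary < threshold


# Hypothetical binary: companion with 0.3 of the primary radius and 1% of its flux.
print(is_appropriate(1.0, 1.0, 0.3, 0.01, delta_mag=5.0))
```

The same binary fails the test when it is too bright relative to the target (the primary
eclipse is too deep) and when it is too faint (the primary eclipse falls below the threshold).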

To calculate the probability of all these conditions
being met (a star being binary and being "appropriate"),
we use the TRILEGAL simulations and assume binary
properties according to the work of Raghavan et al.
(2010). That is, we assume a flat mass ratio distribution
between 0.1 and 1 (Raghavan et al. (2010) actually observe
the distribution to be flat between about 0.2 and
1, but we extend it to 0.1 to be more conservative).

For each star in a particular TRILEGAL line-of-sight simulation that lies in the appropriate
magnitude range (§2.1.1), we first randomly assign it to be a binary or not and then calculate
what the primary and secondary diluted depths would be if the system were eclipsing and
blended with a Kepler target star of a particular magnitude. $R_1$ and $F_1$ are provided by
TRILEGAL, and we determine $R_2$ and $F_2$ based on a randomly assigned mass ratio and the
Padova models at the age of the primary. Given these system parameters, we can then randomly
determine if each system undergoes a non-grazing eclipse, according to the probability that
each system will be in such an orientation:

$$\Pr(\mathrm{eclipse}) = \frac{R_1 - R_2}{a}, \tag{12}$$

where a is the orbital semi-major axis, determined from Kepler's third law.

^ This properly accounts for the possibility that the blend might
be an evolved system; e.g. a dwarf star eclipsing a giant.

From this procedure, using a Kepler target star of $m_K = 14$, an orbital period of 10 days,
and a line-of-sight simulation at the center of the Kepler field, we find that 1.25% of
binaries have non-grazing eclipses and about 20% of those eclipsing binaries are
"appropriate." Combined with a ~40% binary fraction, this results in a probability of 0.001
for a star to be an appropriate eclipsing binary, giving a value of
$\pi_{\rm BB} = 0.11 \times 0.001 = 1.1 \times 10^{-4}$ for the center of the Kepler field.
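
The arithmetic behind this fiducial value can be checked directly. The variable names below
are illustrative; the numbers are the ones quoted in the text for a 14th magnitude target at
the center of the field.

```python
binary_fraction = 0.40        # approximate binary fraction of blending stars
eclipse_fraction = 0.0125     # binaries with non-grazing eclipses (10-day period)
appropriate_fraction = 0.20   # eclipsing binaries that pass the depth conditions
p_blend = 0.11                # probability of a blending star within 2 arcsec

# Probability that a given blending star is an appropriate eclipsing binary.
p_appropriate_eb = binary_fraction * eclipse_fraction * appropriate_fraction

# Prior for the blended eclipsing binary scenario, as in Equation (6).
pi_bb = p_blend * p_appropriate_eb
print(p_appropriate_eb, pi_bb)
```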

As in §2.1.1, we empirically investigate how this probability changes as a function of
Galactic latitude and target star magnitude. We find the behavior for any particular magnitude
is well described by a shallow linear relation in b:

$$\Pr(\text{appropriate ecl. binary}) = b\,D(m_K) + E(m_K), \tag{13}$$

where again the variation of the values of the coefficients D and E is modeled well with a
polynomial in $m_K$ (Table 1).

Multiplying Equation (13) by Equation (8) then gives a full analytic expression for the
probability of a star of given Kepler magnitude at a given Galactic latitude to be blended
with an eclipsing binary system able to mimic a planetary transit:

$$\pi_{\rm BB}(m_K, b) = \left[C(m_K) + A(m_K)\,e^{-b/B(m_K)}\right] \times \left[b\,D(m_K) + E(m_K)\right], \tag{14}$$

where A, B, C, D, and E are polynomial functions of $m_K$
with coefficients given in Table 1.
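
Equation (14) can be evaluated directly from the Table 1 coefficients. The sketch below rests
on one loudly stated assumption: the coefficients are taken to be ordered from the highest
power of $m_K$ ($c_0$, multiplying $m_K^4$) down to the constant term ($c_4$). We adopt this
ordering because it yields blend probabilities of order 0.1 near the field center, in line
with the fiducial example, whereas the reverse ordering does not. The function names and the
latitude of 13.5° used in the example are our own illustrative choices.

```python
import math

# Table 1 coefficients, in the order (c0, c1, c2, c3, c4).
TABLE1 = {
    "A": (-2.5038e-3, 0.12912, -2.4273, 19.980, -60.931),
    "B": (3.0668e-3, -0.15902, 3.0365, -25.320, 82.605),
    "C": (-1.5465e-5, 7.5396e-4, -1.2836e-2, 9.6434e-2, -0.27166),
    "D": (2.7978e-7, -1.5572e-5, 3.1957e-4, -2.8543e-3, 9.3191e-3),
    "E": (-6.4215e-6, 3.5358e-4, -7.1463e-3, 6.2522e-2, -0.19743),
}


def coefficient(name, m_k):
    """Evaluate one Table 1 polynomial by Horner's rule (assumed: c0 multiplies m_K**4)."""
    value = 0.0
    for c in TABLE1[name]:
        value = value * m_k + c
    return value


def p_blend(b_deg, m_k):
    """Equation (8): probability of a possibly blending star within 2 arcsec."""
    return coefficient("C", m_k) + coefficient("A", m_k) * math.exp(-b_deg / coefficient("B", m_k))


def p_appropriate_eb(b_deg, m_k):
    """Equation (13): probability that a blending star is an appropriate eclipsing binary."""
    return b_deg * coefficient("D", m_k) + coefficient("E", m_k)


def pi_bb(b_deg, m_k):
    """Equation (14): blended eclipsing binary prior."""
    return p_blend(b_deg, m_k) * p_appropriate_eb(b_deg, m_k)


# Illustrative evaluation for a 14th magnitude target at b = 13.5 degrees (assumed latitude).
print(p_blend(13.5, 14.0), p_appropriate_eb(13.5, 14.0), pi_bb(13.5, 14.0))
```

The polynomials involve strong cancellation between terms, so the tabulated coefficients
should be used at full quoted precision and only within the stated range of validity.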

2.2. Hierarchical Triples 

The probability that a Kepler target star is in fact a hierarchical triple system configured
such that it might be able to mimic a planetary transit ($\pi_{\rm HT}$) can be broken down as
follows:

$$\pi_{\rm HT} = \Pr(\mathrm{triple}) \cdot \Pr(\text{eclipsing and appropriate}). \tag{15}$$

The first factor is simply the frequency of triple systems, which Raghavan et al. (2010)
determine to be 8% for sun-like stars. The fraction of triple systems that are of appropriate
configuration can be determined by using the same conditions as we used above in §2.1.2. That
is, we require the diluted eclipse depths (Equations (9)-(11)) to be between 0.02 and
$10^{-4}$, except this time one of the three triple components provides the diluting flux.

We assume two different hierarchical possibilities for triple systems. Referring to the three
components in order of descending mass as A, B, and C, the triple system may either be set up
as A + BC, where B & C are the closer potentially eclipsing pair and A is the diluting star,
or as AC + B, with A & C as the closer pair and B diluting. We ignore the case AB + C because
the faintest component being the diluting star would be unable to mimic a planet transit.

^ To be precise, we actually use a binary fraction function that increases with stellar mass:
40% for $M < M_\odot$, 50% for $M_\odot < M < 1.5\,M_\odot$, and 75% for $M > 1.5\,M_\odot$,
roughly adapted from Figure 12 in Raghavan et al. (2010). This is a conservative estimate of
the binary fraction, as the Raghavan figure includes multiple systems as well as binaries.

Fig. 3. The false positive probability of a Kepler candidate, according to our basic framework
(i.e. independent of $\delta$), as a function of target star magnitude $m_K$ and Galactic
latitude. A planet occurrence rate of 20% is assumed. This plot assumes that Kepler is able to
internally restrict the radius inside which a possible blended binary might reside to 2".

We calculate the probability that a triple system will be eclipsing and "appropriate" (again
assuming a 10-day orbit) as follows:

$$p_a = \int\!\!\int \Lambda(M_A, q_1, q_2)\,\Phi_q\,dq_1\,\Phi_q\,dq_2. \tag{16}$$

$\Lambda(M_A, q_1, q_2)$ equals 1 if the system is eclipsing and can mimic a transit and 0 if
not, and the mass ratios $q_1 = M_B/M_A$ and $q_2$ (either $M_C/M_A$ or $M_C/M_B$, with 50/50
odds) determine the architecture of the triple system. $\Phi_q$ is the mass ratio distribution
that we used in §2.1.2 (flat between 0.1 and 1). We assign the radius and flux of each
component according to the Padova model grids in order to calculate both the non-grazing
eclipse probability and the diluted eclipse depths. Evaluating this integral numerically we
obtain $p_a = 0.0015$, which results in
$\pi_{\rm HT} = 0.08 \times 0.0015 = 1.2 \times 10^{-4}$.

Unlike the blended eclipsing binary scenario, the prob- 
ability of a target being a hierarchical eclipsing triple 
does not depend either on galactic latitude or apparent 
magnitude. There is a very weak dependence on stellar 
mass of the primary, but for our calculations we simply assume
that all target stars have masses close to 1 $M_\odot$,
which is reasonable as Kepler is specifically targeting 
solar-type stars. 

2.3. Basic Framework: Summary and Discussion 

Now that we have determined the priors for both false positive scenarios, we are able to
evaluate the FPP (Equations (4) and (5)) by assuming a frequency of close-in planets. We adopt
a 20% frequency according to the results of the NASA-UC Eta-Earth Survey of Howard et al.
(2010). This conservative estimate of 20%, combined with a 5% transit probability for a planet
on a 10-day orbit (the period we have been assuming up to now) gives $\pi_{\rm pl} = 0.01$.
From a planet detection standpoint, this result is promising, as it gives a 98% probability
that an observed planet-like transit signal around a $m_K = 14$ star in the middle of the
Kepler field is authentic, and thus an FPP of only 2%. Because of the variation of the
background stellar density across the field, this value varies with Galactic latitude and
$m_K$, as shown in Figure 3. This is a remarkable result, as it indicates that almost every
signal that passes the Kepler astrometric and photometric false positive tests is likely a
planet transit, before any RV confirmation attempts.

One might rightly pause at this juncture and wonder how the false positive probability for
Kepler can be so low. After all, transit searches up until now, both ground-based (e.g. HAT,
WASP) and space-based (e.g. CoRoT), have been plagued by false positives (Konacki et al. 2003;
O'Donovan et al. 2006; Poleski et al. 2010; Almenara et al. 2009). To address this, consider
what Equation (4) would say about the probability of a transit signal being true for those
experiments; these results are summarized in Table 2.

Taking the Hungarian-made Automated Telescope Network (HATNet) as an example of a ground-based
survey, we note that its 11 cm telescopes produce a photometric aperture of about 14" in
radius (Hartman et al. 2004). Using this radius and a depth of 0.5% as a detection threshold,
we repeat the analysis of §2.1, using the line-of-sight simulation at the center of the Kepler
field for the sake of comparison. For the probability of a possibly blending star to be within
the aperture we obtain 1.67, which must obviously now be interpreted as an average number of
blending stars per aperture instead of a probability. For the probability of a blending star
to be an appropriate eclipsing binary we obtain $8.4 \times 10^{-4}$, giving
$\pi_{\rm BB} = 1.67 \times 8.4 \times 10^{-4} = 0.0014$. Following §2.2, we calculate
$\pi_{\rm HT} = 2.7 \times 10^{-4}$. Finally, taking into account that the probability of a
sun-like star hosting a planet easily detectable by this survey is only about 1%,
$\pi_{\rm pl} = 0.01 \times 0.05 = 5 \times 10^{-4}$ for this survey. This results in an FPP of
0.77 for a hot Jupiter-like transit signal for a HAT-like ground-based search, according well
with Latham et al. (2009), who describe the results of follow-up efforts of a sample of
transit candidates, eight of which turned out to be blended binaries and one to be a planet.

The space-based mission CoRoT (Baglin 2003) has also had difficulties with false positives.
Though it obtains much better photometric precision than a ground-based search and benefits
from uninterrupted observing, its large, 320 arcsec$^2$ aperture (Almenara et al. 2009)
results in an expected number of 2.71 blended stars for a $m_K = 14$ target star. In addition,
its photometric precision is about one part in $10^3$, resulting in $\pi_{\rm BB} = 0.0035$
and $\pi_{\rm HT} = 1.4 \times 10^{-4}$. Assuming then a 20% occurrence rate of planets
detectable by CoRoT, this gives an FPP of 0.27. At first this appears to somewhat contradict
Almenara et al. (2009), who reported 6 planets and 25 diluted binaries among CoRoT's "solved
candidates" (ignoring the "undiluted binary" category, as we are not considering that
possibility for Kepler). However, if one considers how much easier (and faster) it is to
identify a false positive than to positively confirm a planet, this prediction can certainly
be consistent with these results, as only 49 of their 122 candidates had been solved at the
time. In fact, a prediction of our methods is that many of the unsolved CoRoT candidates are
indeed planets.
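
The three-survey comparison summarized in Table 2 can be reproduced from the quoted priors
alone. The sketch below simply applies Equations (4) and (5) to each row; the dictionary
layout and function name are illustrative choices, not part of the original analysis.

```python
def false_positive_probability(pi_pl, pi_bb, pi_ht):
    """Priors-only FPP from Equations (4) and (5)."""
    return 1.0 - pi_pl / (pi_pl + pi_bb + pi_ht)


# Priors quoted in the text and in Table 2: (pi_BB, pi_HT, pi_pl).
SURVEYS = {
    "Kepler": (1.1e-4, 1.2e-4, 0.01),
    "HATNet": (0.0014, 2.7e-4, 5e-4),
    "CoRoT": (0.0035, 1.4e-4, 0.01),
}

fpp_by_survey = {
    name: false_positive_probability(pi_pl, pi_bb, pi_ht)
    for name, (pi_bb, pi_ht, pi_pl) in SURVEYS.items()
}

for name, fpp in fpp_by_survey.items():
    print(f"{name:7s} FPP = {fpp:.2f}")
```

The contrast is driven mainly by the blend term: a larger aperture raises the expected number
of blending stars, while a smaller population of detectable planets lowers the planet prior.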

Another reasonable question to ask is how uncertain- 

^ for P < 11. 5d and M > 0.5M [Gumming et al] l(200l) 




Planet occurrence rate 



Fig. 4. — False positive probability as a function of assumed 
planet occurrence rate, for a m/f = 14 target star in the center 
of the Kepler field. The occurrence rate of planets detectable by 
Kepler is not known for s ure, but RV surveys, e specially the NASA- 
UC Eta-Earth Survey of lHoward et al] 1120101 ), have made inroads 
in measuring the fraction of stars hosting low-mass planets. The 
hashed area below 9% represents the occurrence rate of planets 
with P < 50 days that is ruled out with 95% confidence by J7carthi 
counting only the firm detections, and not correcting for complete- 
ness. The central hashed area represents the 95% confidence region 
calculated including candidate planets and completeness correc- 
tion, for minimum masses greater than 3 Af^ . Extrapolating their 
observed mass distribution down to 0.5 brings their total esti- 
mated planet occurrence rate to 43%. Overall, this plot shows that 
our derived FPP cannot reasonably be any higher than 5% if our 
planet occurrence estimate is incorrect, and will likely be lower. 

ties in our models and assumptions propagate through 
to uncertainties in FPP. This is challenging to address 
exactly, as our analysis rests on the results from TRI- 
LEGAL simulations, stellar model grids, and various as- 
sumptions about multiple star systems. Rather than at- 
tempt a detailed start-to-finish treatment of all the un- 
certainties, we instead investigate what happens if we 
artificially inject fractional uncertainties into our prior 
calculations and simulate the results according to our 
analytic fits. We find that 20% fractional uncertainties 
in background stellar density, appropriate eclipse prob- 
ability, and hierarchical eclipsing triple probability lead 
to 17% fractional uncertainty in FPP. This is a fiducial 
example, and the uncertainty in FPP scales linearly with 
these component uncertainties. 
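
The injection experiment described above can be illustrated with a small Monte Carlo. This is only a sketch under stated assumptions, not the calculation used in the paper: the fiducial planet term `x` and the equal blended-binary and hierarchical-triple terms (`bb0`, `ht0`) are hypothetical placeholders chosen to give an FPP of 5%, and the three perturbed factors are treated as independent Gaussians.

```python
import random
import statistics


def fpp_fractional_uncertainty(frac_sigma=0.20, n_draws=50_000, seed=1):
    """Fractional scatter in FPP when three inputs each carry Gaussian fractional noise."""
    rng = random.Random(seed)
    # Hypothetical fiducial terms (NOT taken from the paper): a planet term x and
    # equal blended-binary / hierarchical-triple terms, so that FPP = 5%.
    x = 0.95
    bb0, ht0 = 0.025, 0.025
    samples = []
    for _ in range(n_draws):
        density = rng.gauss(1.0, frac_sigma)  # background stellar density factor
        eclipse = rng.gauss(1.0, frac_sigma)  # eclipse probability factor
        triple = rng.gauss(1.0, frac_sigma)   # hierarchical triple probability factor
        y = bb0 * density * eclipse + ht0 * triple
        samples.append(y / (x + y))
    return statistics.pstdev(samples) / statistics.fmean(samples)


frac = fpp_fractional_uncertainty()
print(f"fractional FPP uncertainty: {frac:.2f}")
```

Because the FPP is nearly proportional to the false positive terms when it is small, halving `frac_sigma` in this sketch roughly halves the resulting scatter, in line with the linear scaling noted in the text.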

One might also wonder how sensitive our derived FPP 
for Kepler is to the assumption that 20% of stars host 
planets, as well as how justifiable such an assumption 
may be. We address these questions in Figure 4. A 20% 
occurrence rate lies in the middle of the measured range of 
occurrence rates for planets with minimum masses $> 3\,M_\oplus$ 
and periods < 50 days from the NASA-UC Eta-Earth 
Survey of Howard et al. (2010). In addition, even the 
most pessimistic interpretation of the Eta-Earth results 
allows for a minimum occurrence rate of 9%, which 
would still imply an FPP of only 7%. More likely, the 
true occurrence rate is somewhat higher than our assumption, 
if not as high as the ~40% implied by a naive 
extrapolation of the observed power-law-like distribution 
down to $0.5\,M_\oplus$. We note that the NASA-UC Eta-Earth 
Survey, as with all RV surveys, is only able to measure 
minimum masses, and thus that the interpretation of the 
true mass of any individual detection depends on an 
assumption about the overall form of the planet mass function 
(Ho & Turner 2010). However, when an ensemble 
of minimum mass measurements is available and its distribution 
resembles a power law with index $\alpha < -1$, the 
most likely explanation is that the true mass function 
follows a similar power-law shape. 

In summary we may say that several factors contribute 
to Kepler being able to minimize the false positive prob- 
lem compared to previous transit surveys. First, its abil- 
ity to astrometrically rule out wide blend scenarios helps 
mitigate the issue of blended binaries. Secondly, its pho- 
tometric precision enables it to identify many false pos- 
itives based on their secondary eclipses. And lastly, Ke- 
pler is sensitive to lower-mass planets, which are signif- 
icantly more common than the larger planets to which 
ground-based surveys are sensitive. 

3. DETAILED FRAMEWORK: CONSIDERING 
TRANSIT DEPTH 

We note that we have not yet discussed any details of 
the transit signal besides its existence, though some of 
these details may be important. For example, one might 
expect positive blended binaries to be more common at 
shallower depths (since faint stars are more common than 
bright stars, and thus more likely to be blended), which 
might make the BB scenario more of a problem for Earth- 
sized transit signals. We have also assumed that planets 
and eclipsing binaries have the same eclipse probability 
(allowing us to cancel the likelihood factors in Equation 
[3]), though this is not exactly true either, as both the 
orbital separations of the systems and the radii of the 
objects are different. And finally, for fainter stars and 
shallower eclipses, it may be more difficult for internal 
Kepler procedures to astrometrically identify blends. 

With these concerns in mind, we may pursue a more 
detailed analysis of any particular transit. There are 
many features of transit light curves that might all be 
used in this exercise, but for now we only take into ac- 
count the depth of the signal, as that is the most easily 
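measured and easily understood quantity. In this case, 
Equation 3 becomes: 

$$\Pr({\rm pl}\,|\,\delta) = \frac{\mathcal{L}_{\rm pl}(\delta)\,\pi_{\rm pl}}{\mathcal{L}_{\rm pl}(\delta)\,\pi_{\rm pl} + \mathcal{L}_{\rm FP}(\delta)\,\pi_{\rm FP}} \qquad (17)$$

Here the likelihood functions provide a means to quantify 
the extent to which the conclusions of our simple 
framework may change as a function of transit depth $\delta$. 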

Figure 5 shows the likelihoods that we estimate for 
the three different scenarios as a function of depth. The 
distributions of depths for the blended binary and hierarchical 
triple scenarios are determined from the same 
calculations that we used to compute the priors, except 
that rather than just counting all the systems that give depths 
that are both planetary and detectable, we keep track of 
the depth of each simulated false positive and build up $\delta$ 
distributions. 

We calculate the $\delta$ distribution for planets assuming a 
simple continuous power law distribution of planet radii 
($dN/dR_p \propto R_p^{-2}$) between 0.5 and 20 $R_\oplus$, and setting 
$\delta = (R_p/R_\star)^2$. While a more sophisticated treatment 




Fig. 5. — Distributions of apparent "transit" depths $\delta$ for different 
scenarios (curves: planet (assumed), blended binary, hierarchical triple). 
The blended binary and hierarchical triple distributions 
are based on TRILEGAL simulations with the binary distribution 
assumptions discussed in §2. Example $\delta$ distributions are given 
for different target star properties, showing how the blended binary 
scenario depends on target star apparent magnitude and how 
the hierarchical triple distribution depends on intrinsic target star 
mass. The planet distribution comes from an assumption of a continuous 
power law in planet radius, $dN/dR_p \propto R_p^{-2}$, including random 
statistical dilution by binary companions. Note how blended 
binaries become insignificant for deep transits and how eclipsing 
triples become insignificant for shallow transits. 

Fig. 6. — As stars get fainter and transit signals get shallower, 
the ability of Kepler to observe a centroid shift indicative of a displaced 
blended eclipsing binary decreases. We parametrize this effect 
according to Equation 18. The plateau towards shallow depths 
is a result of the maximum blending area for this example being 
set to an aperture of 8 Kepler pixels; the location of this plateau 
for any particular target will depend on its aperture size. This 
plot is made for a Galactic latitude in the middle of the 
Kepler field; other latitudes will scale appropriately according to 
the varying stellar density. The planet radii are marked assuming 
a Solar-radius star. 

might involve adopting a planet mass distribution according 
to RV surveys and theoretical mass-radius relations 
(e.g. Fortney et al. 2007; Seager et al. 2007), 
the number of assumptions required for these models and 
the fact that they do not generally include significant 
atmospheres for super-Earth-type planets suggest that 
such efforts are not warranted. In addition, the current 
uncertainties in stellar radius of the Kepler candidate 
host stars further blur the mapping from $\delta$ to $R_p$. Thus 
the main role of the $\delta$ distribution we adopt for planets 
is to encapsulate the assumption that smaller planets are 
more common than large ones, which is consistent with 
radial velocity surveys (Howard et al. 2010). 
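
As a rough illustration of how a power law in planet radius maps to a distribution of transit depths, the sketch below draws radii by inverse-CDF sampling and converts them with $\delta = (R_p/R_\star)^2$. It is a minimal example under assumptions: the exponent `alpha = -2` and the conversion factor of about 109.1 Earth radii per Solar radius are supplied here as defaults, the function name is hypothetical, and the dilution by binary companions mentioned in the Figure 5 caption is not modeled.

```python
import math
import random

R_EARTH_IN_R_SUN = 1.0 / 109.1  # approximate Earth radius in Solar radii (assumed value)


def sample_log_depths(n, r_star=1.0, alpha=-2.0, r_min=0.5, r_max=20.0, seed=0):
    """Draw n values of log10(depth) for planets with dN/dR ~ R**alpha (alpha != -1).

    r_min and r_max are in Earth radii; r_star is in Solar radii.
    """
    rng = random.Random(seed)
    k = alpha + 1.0
    lo, hi = r_min ** k, r_max ** k
    log_depths = []
    for _ in range(n):
        # Inverse-CDF draw from the truncated power law, in Earth radii.
        r_p = (lo + rng.random() * (hi - lo)) ** (1.0 / k)
        depth = (r_p * R_EARTH_IN_R_SUN / r_star) ** 2
        log_depths.append(math.log10(depth))
    return log_depths


log_depths = sample_log_depths(20_000)
print(f"median log10(depth) = {sorted(log_depths)[len(log_depths) // 2]:.2f}")
```

A histogram of `log_depths` gives a simple stand-in for the planet curve in Figure 5; with a negative exponent most of the weight falls at the shallow end, which is the assumption the text says the adopted distribution is meant to encode.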

Another consideration that should vary with $\delta$ is the 
ability of Kepler to astrometrically identify displaced 
blends. In §2 we assumed a radius of 2" inside which 
a blend might reside. However, this radius should increase 
as transits get shallower and stars get fainter and 
the signal-to-noise of the centroid shift signal decreases. 
This is a question that the Kepler team should be able 
to address using simulations of its offset-detecting procedures, 
but for our purposes we use the radius that the Kepler 
team obtained for Kepler-10b (1.17"; Batalha et al. 2011) 
and assume scaling with $\delta$ and $m_K$ as follows: 

$$r = 1.17'' \sqrt{10^{0.4(m_K - 11)}} \left(\frac{\delta}{\delta_{\rm 10b}}\right)^{-1}, \qquad (18)$$

with $m_K = 11$ being the value for Kepler-10 and $\delta_{\rm 10b}$ the 
transit depth of Kepler-10b. To be conservative 
we set the minimum r to be 2" if this expression 
gives a smaller value. On the high end, we cap the radius 
at 6.4", corresponding to an area equivalent to 8 Kepler pixels, 
a typical aperture size (though for any particular target 
this will vary). The square root factor accounts for 
a diminishing number of photons received as the target 
star gets fainter, and the inverse relationship with $\delta$ arises 
because the centroid shift scales as $\delta$: $\Delta C \sim \delta \cdot r$. Figure 6 
illustrates this effect; bright stars and deeper transits 
give Pr(blend) as determined in §2.1.1, but as the target 
star gets fainter and the signal shallower, the expected 
number of possibly blending stars begins to increase substantially, 
up to the point at which our calculated blend 
radius exceeds the maximum assumed 8-pixel aperture 
area. 
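
The blend-radius scaling can be written as a short function. This is a sketch that reads Equation 18 as $r = 1.17''\sqrt{10^{0.4(m_K - 11)}}\,(\delta/\delta_{\rm ref})^{-1}$ with a 2" floor and a cap set by the aperture area. The reference depth `depth_ref` (the Kepler-10b transit depth) is not quoted in this section, so it is left as a required argument, and the value 1.5e-4 used in the example calls is an illustrative assumption. The 4" pixel size follows the note to Table 3.

```python
import math

PIXEL_ARCSEC = 4.0  # side of one Kepler pixel in arcseconds, per the Table 3 notes


def blend_radius(m_k, depth, depth_ref, n_pixels=8, r_ref=1.17, m_ref=11.0, r_min=2.0):
    """Radius (arcsec) inside which a displaced blend cannot be excluded astrometrically."""
    # Photon-noise scaling with magnitude, inverse scaling with transit depth.
    r = r_ref * math.sqrt(10.0 ** (0.4 * (m_k - m_ref))) * depth_ref / depth
    # Cap: radius of the circle whose area equals the photometric aperture.
    r_max = math.sqrt(n_pixels * PIXEL_ARCSEC ** 2 / math.pi)
    return min(max(r, r_min), r_max)


# depth_ref = 1.5e-4 is an assumed, illustrative reference depth.
bright_deep = blend_radius(m_k=11.0, depth=1.0e-3, depth_ref=1.5e-4)
faint_shallow = blend_radius(m_k=15.0, depth=1.0e-4, depth_ref=1.5e-4)
print(bright_deep, round(faint_shallow, 2))
```

For an 8-pixel aperture the cap evaluates to about 6.4", matching the value quoted in the text, so faint stars with shallow signals saturate at the plateau seen in Figure 6.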

4. RESULTS 
4.1. General 

The adoption of these more detailed considerations enables 
us to estimate the FPP as a function of $\delta$ for a star 
of given apparent Kepler magnitude, Galactic latitude, 
stellar radius, stellar mass, and aperture size. This is 
illustrated for a fiducial $m_K = 14$ Sun-like star in the 
middle of the Kepler field (Figure 7), assumed to have 
an 8-pixel aperture. We first note that over the whole 
range of $\delta$, the FPP generally remains low, indicating 
that these additional considerations do not significantly 
change the qualitative conclusions we reached within the 
simple framework. The majority of transit signals in the 
Kepler data release will be actual planets. 

We next draw attention to several features of the plot. 
First, we note that any approximately Jupiter-sized can- 
didate, whether around a bright star or faint, is al- 
most certainly a planet. This is simply because it is 
extremely difficult to arrange a diluted binary system 
with a Jupiter-sized primary depth and an undetectable 
secondary eclipse. 

The second feature of interest is the peak in FPP 
around $\log\delta = -3.1$, corresponding to about $3\,R_\oplus$ for 
Fig. 7. — The probability that a signal of a given depth will be 
a false positive, shown for both an $m_K = 11$ and an $m_K = 15$ 
star with Solar properties, Galactic latitude in the center of the 
Kepler field, and an 8-pixel Kepler aperture. An overall planet 
occurrence rate of 20% and a planet radius function $dN/dR \propto R^{-2}$ 
are assumed. Note that the false positive probability peaks 
around depths corresponding to about $3\,R_\oplus$, due to the peak there 
in the hierarchical triple $\delta$ distribution. For the fainter star, the 
false positive probability increases for shallower transits because 
it becomes more difficult for Kepler to rule out displaced blended 
binaries via astrometry. However, if a single high-resolution image 
is able to restrict the possible blend radius to 2", then the FPP 
for small-$\delta$ signals is drastically reduced. The exact shape of this 
curve will vary with target star parameters as the shapes of the $\delta$ 
distributions for the false positive scenarios change (see Figure 5). 

a Solar-radius host star. The origin of this peak may be 
understood by examining Figure 5 and recognizing that 
this corresponds to the depth $\delta$ at which we predict hierarchical 
triple false positives to peak for a Solar-mass 
target star. This raises the priors-only estimated FPP 
from §2.3 by a factor of about 3 for planet candidates 
slightly smaller than Neptune. We also note that FPP 
for signals deeper than this peak is nearly independent 
of target star apparent magnitude; this is because the 
eclipsing triple scenario dominates false positives in this 
regime and the contribution from blended binaries is negligible. 
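
The correspondence between the depth of the FPP peak and a planet radius follows from $\delta = (R_p/R_\star)^2$. The snippet below is a quick check, assuming a Solar radius of about 109.1 Earth radii (a value supplied here, not taken from the text) and using a hypothetical helper name.

```python
import math

R_SUN_IN_R_EARTH = 109.1  # approximate Solar radius in Earth radii (assumed value)


def radius_from_log_depth(log_depth, r_star=1.0):
    """Planet radius in Earth radii implied by log10(depth) for a star of r_star Solar radii."""
    return r_star * R_SUN_IN_R_EARTH * math.sqrt(10.0 ** log_depth)


peak_radius = radius_from_log_depth(-3.1)
print(f"{peak_radius:.2f} Earth radii")
```

For $\log\delta = -3.1$ and a Solar-radius host this gives a radius slightly above 3 Earth radii, consistent with the "slightly smaller than Neptune" description above.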

The third significant feature is the rise in FPP towards 
shallow depths for a target star of Kepler magnitude 
$m_K = 15$, and a similar, though smaller, rise for the 
brighter $m_K = 11$ star at the very shallowest depths. 
This is caused by the effect illustrated in Figure 6, where 
the radius outside of which blends may be ruled out 
by Kepler astrometry alone should increase with smaller 
eclipse depth and fainter stars. Figure 7 also illustrates 
the power of a single deep high-resolution image of any 
low-amplitude candidate system: any progress in shrinking 
the radius inside which a blended binary might reside 
will significantly decrease the FPP for Earth-sized transit 
signals, under the assumption that the occurrence rate of 
planets rises toward smaller masses, as we have assumed. 

We note that the plots in Figures 5, 6, and 7 are only 
for particular chosen values of magnitude and a single 
Galactic latitude in the middle of the Kepler field, as 
well as for particular choices of stellar properties. We 
present a more comprehensive illustration of the FPP 
Fig. 8. — These plots illustrate the behavior of Kepler false positive probability (FPP) as a function of target star magnitude ($m_K$) 
and Galactic latitude, for three particular choices of transit depth $\delta$, all plotted with the same color scale. A planet occurrence rate of 
20% is assumed, and the target star is fixed to have Solar mass, Solar radius, and a photometric aperture of 8 pixels. These plots are 
similar to Figure O except that they take into account both the changing blend radius as a function of $m_K$ and $\delta$ (Equation 18) and the 
relative likelihoods of false positives and planets at the chosen values of $\delta$. All three $\delta$ values show increasing FPP towards fainter target 
stars and lower Galactic latitudes, though the strength of the gradient decreases for the deeper signals, as the relative importance of the 
hierarchical triple scenario increases. Dotted lines show the FPP contours if the blend radius were restricted to 2", illustrating the power 
of a single deep high-resolution observation for the shallower signals. For the Neptune-depth signal, however, as the FPP is dominated by 
the hierarchical triple scenario, restricting the blend radius to 2" has a less dramatic effect (FPP becomes about 4% in this case, and 
changes very little across the parameter space). 



manifold in Figure 8, choosing three specific values of $\delta$ 
to illustrate how the FPPs for different types of signals 
vary with target star magnitude and Galactic latitude. 
We fix the target star to have Solar properties in these 
examples. 

Earth-sized transits show a steep gradient across the 
field and towards fainter stars; this is a result of the increasing 
contribution to the FPP from blended binaries (see 
Figure 2), combined with the increased blend radius for a 
shallow transit (Figure 6). This gradient is shallower for 
a $2\,R_\oplus$ signal and almost disappears for a Neptune-sized 
signal, because of the growing contribution of the hierarchical 
triple scenario. These plots also illustrate the potential 
power of deep high-resolution imaging follow-up 
observations. If such an image is taken and no companion 
is found outside a radius of a few arcseconds, then 
that dramatically reduces the FPP for shallow signals, 
as illustrated with the dotted contours in Figure 8. 

4.2. Application to Kepler Candidates 

We apply the framework discussed above to calculate 
the FPP for every Kepler Object of Interest (KOI) published 
in Borucki et al. (2011); these results are summarized 
in Table 3. For each KOI we generate individualized 
$\delta$ distributions for the different false positive scenarios 
using the relevant Kepler magnitude, Galactic latitude, 
and stellar parameters from the Kepler Input Catalog. We 
then calculate the FPP using the actual area of the photometric 
aperture, as determined from the publicly available 
pixel data for each KOI, and the transit depth as 
given in Borucki et al. (2011). The distribution of FPPs 
is illustrated in Figure 9. 

In Table 3 we list the KOI parameters relevant to the 
FPP calculation, the calculated FPPs, and the values 
of the intermediate factors in the calculation, which we 
summarize as $L_{\rm pl}$, $L_{\rm BB}$, and $L_{\rm HT}$, where 

$$L_{\rm pl} = \mathcal{L}_{\rm pl}(\delta)\,\pi_{\rm pl} = f_{\rm pl} \cdot \Pr({\rm Transit}) \cdot \Phi_{\rm pl}(\log\delta), \qquad (19)$$

where $f_{\rm pl}$ is the overall planet occurrence frequency, 
Pr(Transit) is the geometric transit probability, and 
$\Phi_{\rm pl} = dN/d\log\delta$ is the probability density function for 
Fig. 9. — The distribution of false positive probabilities 
(FPPs) among the 1235 Kepler planet candidates announced in 
Borucki et al. (2011). FPP for each candidate is calculated individually, 
taking into account the apparent Kepler magnitude, Galactic 
latitude, mass and radius of the host star, the depth of the signal, 
and the number of pixels contained in the optimal aperture used for 
Kepler photometry. Almost all (1193) have FPPs less than 10%, 
and over half (668) have FPPs less than 5%. The mean FPP of 
the sample is 4%, indicating that we expect there to be fewer than 
50 false positives among the candidate sample. 



$\log\delta$. Thus 

$$\mathrm{FPP} = 1 - \frac{L_{\rm pl}}{L_{\rm pl} + L_{\rm BB} + L_{\rm HT}}, \qquad (20)$$

where $L_{\rm BB}$ and $L_{\rm HT}$ are the corresponding terms for the 
two false positive scenarios. 

TABLE 3 

False Positive Probabilities for Kepler Planet Candidates (a) 

KOI     δ         m_K    # pixels  b      P        M_*   R_*   L_BB     L_HT     L_pl     FPP 
(1)     (2)       (3)    (4)       (5)    (6)      (7)   (8)   (9)      (10)     (11)     (12) 
371.01  1.11e-03  12.19  19        5.94   278.000  1.33  3.01  2.4e-07  1.4e-06  0.00023  0.01 
372.01  7.64e-03  12.39  12        6.82   125.612  1.05  0.95  1.5e-23  4.5e-09  0.00014  < 0.01 
373.01  5.97e-04  12.77  8         11.79  135.194  1.11  1.30  1.2e-06  2.1e-05  0.0005   0.04 
374.01  5.95e-04  12.21  13        13.68  172.673  1.11  1.26  2.8e-07  1.9e-05  0.00042  0.04 
375.01  4.70e-03  13.29  13        15.91  220.000  1.07  1.04  3.9e-10  8.3e-07  0.00012  0.01 
377.01  6.94e-03  13.80  6         14.49  19.258   1.00  0.68  1.5e-11  1.5e-06  0.00052  < 0.01 
377.02  6.24e-03  13.80  6         14.49  38.912   1.00  0.68  1.1e-08  1.5e-06  0.00034  < 0.01 
377.03  2.25e-04  13.80  6         14.49  1.593    1.00  0.68  0.00023  0.00025  0.016    0.03 
379.01  2.51e-04  13.32  10        9.61   6.717    1.19  1.59  8.6e-05  0.00026  0.0059   0.06 
384.01  1.76e-04  13.28  8         8.46   5.080    1.09  1.22  0.00031  0.00015  0.0083   0.05 
385.01  2.69e-04  13.44  5         9.85   13.146   1.04  1.04  4.5e-05  9.3e-05  0.0035   0.04 
386.01  8.45e-04  13.84  5         8.61   31.158   1.11  1.12  1.9e-06  3.2e-05  0.0011   0.03 
386.02  6.60e-04  13.84  5         8.61   76.735   1.11  1.12  2.9e-06  2.8e-05  0.00072  0.04 
387.01  9.41e-04  13.58  9         13.50  13.900   0.69  0.74  2.9e-06  1.1e-05  0.0018   0.01 

(a) Here is printed only a portion of the table to show its format and contents; all 1235 candidates are listed in the full version of the 
table, available online at exoplanets.org/data/KOIFPPtable.txt. 

(1) KOI identifier, from Borucki et al. (2011) 
(2) Transit depth 
(3) Kepler magnitude 
(4) Size, in Kepler pixels (4" square each), of the photometric aperture, according to the publicly available pixel data 
(5) Galactic latitude of target star, in degrees 
(6) Period of candidate, in days 
(7) Stellar mass, according to the Kepler Input Catalog (KIC) 
(8) Stellar radius, according to the KIC 
(9) Likelihood × prior for the blended binary scenario 
(10) Likelihood × prior for the eclipsing hierarchical triple scenario 
(11) Likelihood × prior for the transiting planet 
(12) False positive probability = 1 - L_pl/(L_pl + L_BB + L_HT) 

We list these individual components in the table primarily 
because the FPP calculation fundamentally depends 
on assumptions of the planetary occurrence rate 
and radius distribution, and different assumptions will 
result in different FPPs. Though we show in Figure 4 
that these assumptions are unlikely to dramatically affect 
the final FPP numbers, one could in principle calculate 
$L_{\rm pl}$ based on different assumptions and recalculate 
FPP, given all the components. 
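
To make the recalculation concrete, the sketch below recomputes FPP from the tabulated components via FPP = 1 - L_pl/(L_pl + L_BB + L_HT), using three rows of the Table 3 excerpt. The rescaling helper additionally assumes that L_pl is simply proportional to the assumed occurrence rate (as the factor f_pl in Equation 19 suggests) with the radius distribution unchanged; the function names are hypothetical.

```python
def fpp(l_pl, l_bb, l_ht):
    """False positive probability from the three likelihood-times-prior terms."""
    return 1.0 - l_pl / (l_pl + l_bb + l_ht)


def fpp_for_occurrence(l_pl, l_bb, l_ht, f_new, f_old=0.20):
    """FPP after rescaling the planet term from occurrence rate f_old to f_new."""
    return fpp(l_pl * f_new / f_old, l_bb, l_ht)


# (L_BB, L_HT, L_pl) for three KOIs, copied from the Table 3 excerpt.
rows = {
    "373.01": (1.2e-06, 2.1e-05, 0.0005),
    "379.01": (8.6e-05, 0.00026, 0.0059),
    "384.01": (0.00031, 0.00015, 0.0083),
}

for koi, (l_bb, l_ht, l_pl) in rows.items():
    print(koi, round(fpp(l_pl, l_bb, l_ht), 2),
          round(fpp_for_occurrence(l_pl, l_bb, l_ht, f_new=0.09), 2))
```

The first number printed for each KOI reproduces the tabulated FPP (0.04, 0.06, and 0.05 respectively); the second shows how the value would shift under a more pessimistic 9% occurrence rate.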

5. DISCUSSION: RELATIONSHIP TO "BLENDER" 

The FPP analysis we present in this paper is not the 
first false positive analysis that has been done regarding 
Kepler candidates. In fact, the Kepler team has statistically 
"validated" three planets: Kepler-9d (Torres et al. 
2011), Kepler-11g (Lissauer et al. 2011), and Kepler-10c 
(Fressin et al. 2011), by demonstrating that the chance 
of any of those signals being due to a false positive is 
low enough to consider the candidate a bona fide planet. 
This has been done using the procedure the Kepler team 
has named BLENDER. 

BLENDER attempts to directly model the candidate 
light curve using every conceivable false positive scenario, 
informed by high-resolution imaging follow-up observa- 
tions. The goodness-of-fit of the false positive models is 
then compared to the best-fit planetary model. The false 
positive scenarios that cannot fit the light curve as well 
as a transiting planet model are rejected. The a priori 
likelihood of the remaining scenarios (those false posi- 
tive scenarios that provide comparable-quality fits to the 
light curve) is then assessed relative to the likelihood of 
a bona fide transiting planet, and if the planetary expla- 
nation is much more likely, then the planet is considered 
validated. 

As can be inferred from the fact that the Kepler team 
has published only three validated planets to date out of 
over 1200 planet candidates that have been made pub- 
lic, BLENDER is a very time-consuming procedure, being 
both computationally expensive and labor-intensive. 
Relying on extensive modeling of individual light curves 
and requiring a suite of follow-up observations to be most 
effective, it can only be applied to single KOIs on an in- 
dividual basis. 

If BLENDER may be characterized as a "deep and 
narrow" false positive analysis tool, the FPP analysis 
we present in this paper might be described as "shallow 
and wide." It takes only 15 seconds per candidate for 
us to generate the $\delta$ distributions required to calculate 
the individualized FPP numbers listed in Table 3, which 
makes our analysis easily and immediately applicable to 
all the KOIs, whereas BLENDER takes months of com- 
putation and analysis per candidate. On the other hand, 
BLENDER takes into account all possible information 
about each KOI (detailed light curve shape, AO imag- 
ing, multiwavelength transit information, etc.), whereas 
we only consider the depth of the transit signal and the 
properties of the target star. 

Another way to think of the relationship between our 
FPP analysis and BLENDER is that if BLENDER is 
an N-step procedure, our analysis is step N. We ignore 
most of the detail of the light curve and make no use of 
any follow-up observations, but go straight to the a pri- 
ori likelihood calculation and do that step as carefully 
as possible. What is remarkably encouraging for the 
Kepler mission is that even this "shallow," single-step 
analysis is enough to determine that the false positive 
probability for almost every KOI is less than 10%, and 
for over half the KOIs is less than 5%. 

If our analysis is step N of the BLENDER process, 
how would the first $N - 1$ steps be incorporated into 
the present analysis to improve upon the FPPs published 
here? First, consider that if $x = L_{\rm pl}$ and $y = L_{\rm BB} + L_{\rm HT}$, 
then the probability $p_{\rm pl}$ that a signal is a planet is the 
following: 

$$p_{\rm pl} = \frac{x}{x + y}. \qquad (21)$$

This may be rewritten as 

$$p_{\rm pl} = \frac{1}{1 + y/x}. \qquad (22)$$

If $y \ll x$ (as we have shown it typically is) then 

$$p_{\rm pl} \approx 1 - \frac{y}{x}, \qquad (23)$$

or 

$$\mathrm{FPP} \approx \frac{y}{x}. \qquad (24)$$

The typical role of BLENDER in this context can then be 
thought of as multiplying y by a factor we call $f_{\rm BLENDER}$ 
($0 < f_{\rm BLENDER} < 1$) that represents the fraction of the 
potential false positive scenarios (weighted by their intrinsic 
likelihoods) that produce acceptable fits to the 
light curve. Thus if BLENDER were to rule out 90% 
of the false positive scenarios considered in our analysis 
($f_{\rm BLENDER} = 0.1$) for a particular system, then that 
would decrease FPP for that system by a factor of 10; 
such analysis would be enough to make FPP < 0.01 for 
almost every KOI. 
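
The effect of such a factor can be checked numerically without the small-y approximation. The sketch below converts a prior FPP to the ratio y/x (with x the planet term and y the sum of the false positive terms), scales y by the BLENDER factor, and converts back using FPP = y/(x + y). The function name is hypothetical, and the treatment of BLENDER as a single multiplicative factor follows the simplified picture in the text.

```python
def fpp_after_blender(fpp_prior, f_blender):
    """FPP after a fraction (1 - f_blender) of false positive scenarios is ruled out."""
    y_over_x = fpp_prior / (1.0 - fpp_prior)  # ratio of false positive to planet terms
    y_over_x *= f_blender                     # surviving false positive scenarios
    return y_over_x / (1.0 + y_over_x)


for prior in (0.10, 0.05):
    print(prior, round(fpp_after_blender(prior, 0.1), 4))
```

For priors of 10% or less the exact result stays within about ten percent of the approximate "factor of 10" reduction, so a KOI starting near the 10% ceiling ends close to the 0.01 level.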

In a similar spirit, for those KOIs whose FPP is domi- 
nated by the blended binary scenarios (mostly the shal- 
lowest signals), y can also be significantly decreased sim- 
ply if deep high-resolution imaging shows no potentially 
blending companions. This effect is illustrated in Figure 
8, where dotted FPP contours are drawn illustrating the 
effect of restricting the "blend radius" to 2". Decreasing 
this even further to 1" or smaller would give another fac- 
tor of 4 or more reduction in FPP. Thus we demonstrate 
that simply obtaining deep high-resolution images may 
be just as effective as the entire BLENDER analysis for 
probabilistically validating some KOIs! 

In some cases of course, follow-up imaging observa- 
tions will identify the presence of nearby stars within the 
"blend radius" inside of which astrometric offset methods 
were previously unable to identify displaced blends. In 
these cases, the analysis presented in this paper must be 
superseded by a more specifically tailored analysis such 
as BLENDER. In general a detected nearby blend will 
cause the preliminary FPP to substantially increase, as 
the Pr(Blend) factor that we found to be of order ~0.10 
(§2.1.1) is then divided out from the $L_{\rm BB}$ term, making 
it more comparable to the $L_{\rm pl}$ term. In these cases a 
full suite of follow-up observations and the more detailed 
holistic approach that BLENDER utilizes will become 
necessary to validate candidates. 

6. CAVEATS AND CONCLUSIONS 

We present both a framework to analyze the a priori 
false positive probability (FPP) of Kepler planet can- 
didates and preliminary FPPs for the entire sample of 
1235 released candidates, finding that FPP < 10% for 
almost all the KOIs and <5% for over half the KOIs. 
The philosophy we adopt in this work is to calculate 



conservative upper limits to these FPPs; further anal- 
ysis may well demonstrate them to be lower, but we do 
not expect them to be higher. Thus we may confidently 
say that our analysis indicates that fewer than 50 
of the 1235 candidates are likely to turn out to be false 
positives. 

However, these conclusions are based on several as- 
sumptions that come with some caveats: 

• We assume that all candidates have passed all preliminary 
false-positive-vetting procedures that are 
possible using Kepler photometry and astrometry 
alone. In particular we assume that the transits 
are not obviously V-shaped, there is no detectable 
secondary eclipse, and that careful centroid anal- 
ysis has not revealed the presence of a displaced 
blended binary. If photometry or astrometry for 
a candidate actually does turn out to indicate a 
possible false positive, then the FPPs calculated in 
this paper for that KOI are not accurate. 

• We assume host star stellar parameters according 
to the Kepler Input Catalog (KIC). If stellar radii 
or stellar types are found to be significantly differ- 
ent from the KIC estimates, then that could change 
the interpretation of transit signals (e.g. turning a 
Jupiter-sized planet into an M-dwarf). 

• We assume a planet radius function that increases 
towards smaller planets. There are many reasons, 
both theoretical and observational, to assume this 
is correct, but if it is not, then the false positive 
numbers for the smallest candidates would be a 
factor of two or so higher. 

We also emphasize that the intention of this paper is 
not to encourage other analyses to completely ignore the 
possibility that some Kepler candidates might be false 
positives. Rather, we suggest that in statistical analyses 
using the ensemble of KOIs to investigate the distribution 
of planet properties, the FPPs in this paper (or based on 
the calculations in this paper; e.g. with different assump- 
tions of the planet occurrence rate or radius function) be 
used to count "fractional planets"; i.e. for a KOI with 
FPP = 0.05 to count as 95% of a planet. 
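
A minimal sketch of this "fractional planet" bookkeeping is given below: each candidate contributes 1 - FPP to the expected planet count and FPP to the expected false positive count. The function name is hypothetical, and the uniform 4% sample used in the example is only a stand-in built from the quoted mean FPP, not the actual per-candidate values.

```python
def expected_counts(fpps):
    """Expected numbers of planets and false positives for a list of per-candidate FPPs."""
    n_false = sum(fpps)
    n_planets = sum(1.0 - p for p in fpps)
    return n_planets, n_false


# Stand-in sample: 1235 candidates, each assigned the quoted mean FPP of 4%.
n_planets, n_false = expected_counts([0.04] * 1235)
print(round(n_planets, 1), round(n_false, 1))
```

With the mean FPP of 4%, the expected number of false positives among 1235 candidates is about 49, which is where the "fewer than 50" statement comes from. In a real analysis the weights 1 - FPP would be applied candidate by candidate when building occurrence statistics.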

Finally, we provide several suggestions to guide and 
optimize Kepler follow-up efforts, based on the results of 
our analysis: 

• For the shallowest candidates, or those for which a 
blended binary is the most likely false positive sce- 
nario, we recommend deep high-resolution imaging 
(with a target contrast ratio corresponding to the 
depth of the signal: $\Delta m_K = -2.5 \log\delta$), as excluding 
the presence of potentially blending stars at 
close separation will be the quickest path toward 
validation of such systems. Contrast ratios up to 
10 magnitudes as close as 1" have long proven to 
be technically feasible (e.g. Luhman & Jayawardhana 
2002; Biller 2007). 

• For candidates of intermediate depth for which a 
hierarchical eclipsing triple is the most likely false 
positive scenario, we recommend follow-up efforts 
targeted toward the identification of physically 
bound companions to the KOI. High-resolution 
imaging is one useful tool here (though not necessarily 
as deep as those observations targeting projected 
binaries) to target wide-separation companions, 
but high S/N spectroscopy (both optical and 
infrared) may be even more important, in order to 
spectroscopically identify or constrain the presence 
of low-mass stellar companions. 

• For the candidates with the largest implied radii 
we recommend primarily spectroscopic follow-up to 
improve our knowledge of the physical parameters 
of the candidate host stars, in order to rule out the 
possibility of an eclipsing binary being misclassified 
as a transiting planet due to an incorrect assumed 
radius. 

In summary, the exquisite photometric and astrometric 
precision of the Kepler instrument enables many of the 
false positives that have traditionally plagued transit surveys 
to be identified prior to follow-up observations. The 
result is that the majority of the candidates announced 
by Borucki et al. (2011) are likely to be bona fide planets. 
Thus, having surveyed the landscape of false positives in 
the Kepler field, we conclude that the outlook is bright 
for statistical analyses of exoplanet occurrence and properties 
based on the data made public by the Kepler team. 

APPENDIX 
BLENDED PLANETS 

In the present work we consider as false positives only astrophysical configurations that do not involve any planets but 
still mimic the signal of a transiting planet. However, there are various other scenarios involving "blended planets" that, 
while not strictly false positives (i.e. a transiting planet is still involved), may contribute significantly to uncertainty 
in the planet parameters derived from the transit signal. A "blended planet" for our purposes is a transit signal 
that appears to be a planet of a particular size transiting the target star but is actually a larger planet transiting a 
fainter blended star. As before, these scenarios can be divided into chance-alignment systems or physically associated 
hierarchical systems. 

We have calculated that chance-alignment blended planets are significantly less likely to occur than their blended 
stellar binary cousins; this can be heuristically understood from the following considerations: 



• Because the deepest intrinsic planetary transits have depths of only ~0.02 and the diluted signal has to be 
detectable (we have adopted $\delta > 10^{-4}$ as a threshold), the maximum contrast between the target star and 
the blending star is $\Delta m_K = 5.75$, which is significantly less than the $\Delta m_K = 10$ we adopted for blended binaries 
in §2.1.1. The sky density of stars available for the chance-alignment blended planet scenario is thus about 5.5 
times lower than that for the blended binary scenario, according to the TRILEGAL simulations. 

• Our assumed planet frequency (~20%) is lower than our assumed binary fraction (~40%). 

• The largest planets, while the most amenable to causing the blended planet scenario because of their larger 
intrinsic transit depth, are the least common: only ~1% of Solar-type stars host close-in giant planets, and this 
occurrence rate is even lower for lower-mass stars (Endl et al. 2003; Johnson et al. 2010), which are the most 
common blending stars. 
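
The contrast limits quoted in the first item above follow from the magnitude-flux relation $\Delta m = -2.5 \log_{10}(\text{flux ratio})$. The short check below is a sketch with a hypothetical helper name; it treats the blend as much fainter than the target, so that the diluted depth is approximately the intrinsic depth times the flux ratio.

```python
import math


def delta_mag(flux_ratio):
    """Magnitude difference corresponding to a flux ratio (fainter / brighter)."""
    return -2.5 * math.log10(flux_ratio)


# Blended planet: intrinsic depth ~0.02 diluted down to the 1e-4 detection threshold.
max_contrast_planet = delta_mag(1e-4 / 0.02)
# Blended binary: a fully eclipsed blend producing a depth at the 1e-4 threshold.
max_contrast_binary = delta_mag(1e-4)
print(round(max_contrast_planet, 2), round(max_contrast_binary, 2))
```

The two values come out at about 5.75 and 10 magnitudes, matching the contrasts adopted in the text; the same relation gives the imaging depth recommended for a signal of depth $\delta$.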

Physically associated hierarchical planets, on the other hand, might well be relatively common compared to the 
stellar false positive scenarios or chance-alignment blended planets. Another way of saying this is that binary stellar 
systems are relatively common, and so it seems likely that a substantial fraction of Kepler targets (and therefore 
candidates) are in fact binaries of unknown architecture. The net effect of this on the interpretation of the sample 
of planet candidates will be additional uncertainty in the derived planet properties due to both diluting light from a 
binary companion and from possible stellar misclassification by the Kepler Input Catalog, which assumes each star is 
single. We note that the Kepler team does include blended planets in the BLENDER procedure, and in fact that such 
scenarios are often the most difficult to rule out (Kepler team, 2011, private comm.). 

In summary, while the analysis presented in this paper may provide confidence that "classic false positive" stellar 
systems are not often masquerading as Kepler transiting planet candidates, we do caution that uncertainties regarding 
candidate host systems (including whether or not they are binary) must be considered in any statistical analysis of 
the whole candidate sample. 